Indefinite-Horizon POMDPs with Action-Based Termination

Author

  • Eric A. Hansen
Abstract

For decision-theoretic planning problems with an indefinite horizon, plan execution terminates after a finite number of steps with probability one, but the number of steps until termination (i.e., the horizon) is uncertain and unbounded. In the traditional approach to modeling such problems, called a stochastic shortest-path problem, plan execution terminates when a particular state is reached, typically a goal state. We consider a model in which plan execution terminates when a stopping action is taken. We show that an action-based model of termination has several advantages for partially observable planning problems. It does not require a goal state to be fully observable; it does not require achievement of a goal state to be guaranteed; and it allows a proper policy to be found more easily. This framework allows many partially observable planning problems to be modeled in a more realistic way that does not require an artificial discount factor.
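To make the idea concrete, here is a minimal sketch (with hypothetical numbers, not from the paper) of a two-state belief MDP in which termination is triggered by an explicit "stop" action rather than by reaching a goal state. Because stopping is always available and yields a belief-dependent terminal reward, undiscounted value iteration is bounded and converges even though the goal state is neither fully observable nor guaranteed to be reached:

```python
import numpy as np

# Hypothetical parameters for a 2-state problem: state is "goal" or "not goal".
P_GOAL_STAY = 0.9      # P(goal at t+1 | goal, continue)
P_GOAL_REACH = 0.3     # P(goal at t+1 | not goal, continue)
OBS_ACC = 0.8          # P(observe "g" | goal) = P(observe "n" | not goal)
STEP_COST = 1.0        # cost of each "continue" step
R_STOP_GOAL = 10.0     # reward for stopping in the goal state
R_STOP_MISS = -5.0     # penalty for stopping outside the goal state

beliefs = np.linspace(0.0, 1.0, 101)   # grid over b = P(goal)

def belief_update(b, obs):
    """Bayes filter: predict under 'continue', then condition on obs."""
    b_pred = b * P_GOAL_STAY + (1 - b) * P_GOAL_REACH
    like_g = OBS_ACC if obs == "g" else 1 - OBS_ACC
    like_n = 1 - OBS_ACC if obs == "g" else OBS_ACC
    num = like_g * b_pred
    return num / (num + like_n * (1 - b_pred))

def stop_value(b):
    """Expected terminal reward of the stopping action at belief b."""
    return b * R_STOP_GOAL + (1 - b) * R_STOP_MISS

V = stop_value(beliefs)            # initialize with the stop action's value
for _ in range(200):               # undiscounted value iteration
    V_new = np.empty_like(V)
    for i, b in enumerate(beliefs):
        b_pred = b * P_GOAL_STAY + (1 - b) * P_GOAL_REACH
        p_g = OBS_ACC * b_pred + (1 - OBS_ACC) * (1 - b_pred)  # P(obs = "g")
        q_cont = -STEP_COST
        for obs, p in (("g", p_g), ("n", 1 - p_g)):
            q_cont += p * np.interp(belief_update(b, obs), beliefs, V)
        V_new[i] = max(stop_value(b), q_cont)   # stopping is always available
    if np.max(np.abs(V_new - V)) < 1e-9:
        break
    V = V_new

print(round(float(V[-1]), 2))  # value when certain of the goal; prints 10.0
```

Because every policy can stop in one step, every policy is proper in the action-based sense, and no artificial discount factor is needed to keep the values finite.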


Similar articles

Achieving goals in decentralized POMDPs

Coordination of multiple agents under uncertainty in the decentralized POMDP model is known to be NEXP-complete, even when the agents have a joint set of goals. Nevertheless, we show that the existence of goals can help develop effective planning algorithms. We examine an approach to model these problems as indefinite-horizon decentralized POMDPs, suitable for many practical problems that termi...


Improved Planning for Infinite-Horizon Interactive POMDPs using Probabilistic Inference (Extended Abstract)

We provide the first formalization of self-interested multiagent planning using expectation-maximization (EM). Our formalization in the context of infinite-horizon and finitely-nested interactive POMDPs (I-POMDPs) is distinct from EM formulations for POMDPs and other multiagent planning frameworks. Specific to I-POMDPs, we exploit the graphical model structure and present a new approach based on b...


Genetic Algorithms for Approximating Solutions to POMDPs

We use genetic algorithms (GAs) to find good finite-horizon policies for POMDPs, where the search is limited to policies with a fixed finite amount of policy memory. Initial results were presented in (Lusena et al. 1999) with one GA. In this paper, different cross-over and mutation rates are compared. Initializing the population of the genetic algorithm is done using smaller genetic algorithms. The sele...


Efficient Planning for Factored Infinite-Horizon DEC-POMDPs

Decentralized partially observable Markov decision processes (DEC-POMDPs) are used to plan policies for multiple agents that must maximize a joint reward function but do not communicate with each other. The agents act under uncertainty about each other and the environment. This planning task arises in optimization of wireless networks, and other scenarios where communication between agents is r...


Policy Filtering for Planning in Partially Observable Stochastic Domains

Partially observable Markov decision processes (POMDPs) can be used as a model for planning in stochastic domains. This paper considers the problem of computing optimal policies for finite-horizon POMDPs. In deciding on an action to take, an agent is not only concerned with how the action would affect the current time point, but also with its impact on the rest of the planning horizon. In a POMDP, the ...



Publication year: 2007